Intro

Getting Started

Data pulled from Morningstar Direct on Jan. 24, 2023. Files combined using combine_files.r. Data in M* filtered:

  • != Fund of funds: Yes
  • != US Category: Money Market

57,451 rows, 117 columns

Added column Obsolete to indicate if share class is in existence or obsolete. I assume a blank Obsolete..Date means the share class still exists.

Also added Inception_Year and Obsolete_Year because these variables are used below.

Data filtered to remove:

  • “Load Waived” Share.Class.Type
  • Share classes launched and liquidated during the same 18 month period
  • Share classes launched after 2022

48,761 rows, 121 columns

library(tidyverse) #For Data Analysis
library(lubridate) #For working with dates
library(DT) #For visualizing tables


Full <- read_csv("data_combined.csv",
                 guess_max = 20000) %>%
  rename_all(make.names) %>%
  mutate(
    Inception_Year = year(Inception..Date),
    Obsolete_Year = year(Obsolete..Date),
    Obsolete =
      case_when(!is.na(Obsolete..Date) ~ "Obsolete",
                TRUE ~ "Exists"),
    Life_Span = time_length(interval(Inception..Date, Obsolete..Date), "month")
  ) %>%
  filter(
    Share.Class.Type != "Load Waived" |
      is.na(Share.Class.Type),
    Life_Span >= 18 |
      is.na(Obsolete_Year) |
      is.na(Inception_Year),
    Inception_Year < 2023)

Overview

Net Count

There was a large drop off in launches in 2020 compared to recent years. However it appears rationalization continued at roughly the same levels as recent years.

#A function to count the number of share classes either created or liquidated each year

Year_Count <- function(colz){
 Full %>%
    group_by_at(colz) %>% 
    summarise(Count = n(), .groups = "drop") %>% 
    arrange(desc(.[[1]]))
    }
    

inception_year <- Year_Count(colz = c("Inception_Year")) #Tallys share classes by year created
obsolete_year <- Year_Count(colz = c("Obsolete_Year")) #Tallys share classes by year liquidated

merge_type <- Year_Count(colz = c("Obsolete_Year", "Obsolete..Type")) #Tallys share classes by year liquidated and liquidation type


#Counts the net number of share classes created/cut

Net_Count <- full_join(inception_year, obsolete_year, by = c("Inception_Year" = "Obsolete_Year"),
                       suffix = c("_inception", "_obsolete")) %>% 
  rename(Year = Inception_Year) %>% 
  group_by(Year) %>% 
  mutate(Net_Count = sum(Count_inception, -Count_obsolete, na.rm = TRUE))

New Funds

The data from Morningstar is at the share class level. This is a method to look at launches at the fund level.

#This looks at when a fund's oldest share class was created
#This assumes that the oldest share class's inception date is equal to the fund's creation date

New_Funds <- Full %>% 
  filter(Oldest..Share.Class == "Yes") %>% 
  group_by(Inception_Year, Index..Fund) %>% 
  summarise(Count = n(), .groups = "drop") %>% 
  mutate(
    Pct_Change = round(
      ((Count/lag(Count) - 1) * 100),
      1
    )) %>% 
  arrange(desc(Inception_Year))

New Funds (Alternative)

The method above uses the inception date of the Oldest..Share.Class as a proxy for when a fund was created. This method uses FundId.

The oldest inception date related to a FundID is a proxy for when the fund was created. Or the oldest obsolete date related to a a FundID (with no active share classes) is likely when that fund was liquidated/merged.

There were some cases identified in the data where a FundID had no corresponding Oldest..Share.Class. This could be an error or signal that a fund’s oldest share class is not captured in the data from Morningstar.

The results for 2020 ended up equal using the method above.

This method shows there were 15,825 funds in 2020 vs 15,677 in 2019 – a change of 148 funds. This method shows there were 15,464 in 2018 – a change of 213 funds.

Companies

To look at which firms cut or added share classes I first clean up the company names with Branding.Names.Mod.

company_change <- Full %>%
  mutate(
    Branding.Name.Mod =
      case_when(
        grepl("State Street", Branding.Name) ~ "State Street",
        grepl("TIAA", Branding.Name) ~ "TIAA/Nuveen",
        grepl("Nuveen", Branding.Name) ~ "Morgan Stanley",
        grepl("Eaton Vance", Branding.Name) ~ "Morgan Stanley",
        grepl("Calvert", Branding.Name) ~ "Eaton Vance/Calvert",
        grepl("iShares", Branding.Name) ~ "iShares/BlackRock",
        grepl("BlackRock", Branding.Name) ~ "iShares/BlackRock",
        grepl("PowerShares", Branding.Name) ~ "PowerShares/Invesco",
        grepl("Invesco", Branding.Name) ~ "PowerShares/Invesco",
        grepl("DWS$", Branding.Name) ~ "DWS/Xtrackers",
        grepl("Xtrackers", Branding.Name) ~ "DWS/Xtrackers",
        grepl("Capital Group", Branding.Name) ~ "American Funds",
        TRUE ~ Branding.Name
      ))
    
    

    
    company_level_2018 <- company_change %>%
      filter(Inception_Year <= 2018) %>% 
      mutate(
        Count_2018 =
          case_when(
            is.na(Obsolete_Year) ~ "Exists_2018",
            Obsolete_Year <= 2018 ~ "Obsolete_2018",
            Obsolete_Year > 2018 ~ "Exists_2018",
            TRUE ~ "Other"
      ))
    
    company_level_2018 <- company_level_2018 %>%
      select(Branding.Name.Mod,
                Name,
                FundId,
                Count_2018) %>%
      group_by(Branding.Name.Mod, Name, Count_2018) %>%
      summarise(Tally_2018 = n()) %>%
      pivot_wider(names_from = Count_2018,
                  values_from = Tally_2018)
## `summarise()` has grouped output by 'Branding.Name.Mod', 'Name'. You can
## override using the `.groups` argument.
      company_level_2022 <- company_change %>%
      filter(Inception_Year <= 2022) %>% 
      mutate(
        Count_2022 =
          case_when(
            is.na(Obsolete_Year) ~ "Exists_2022",
            Obsolete_Year <= 2022 ~ "Obsolete_2022",
            Obsolete_Year > 2022 ~ "Exists_2022",
            TRUE ~ "Other"
      ))
    
    company_level_2022 <- company_level_2022 %>%
      select(Branding.Name.Mod,
                Name,
                FundId,
                Count_2022) %>%
      group_by(Branding.Name.Mod, Name, Count_2022) %>%
      summarise(Tally_2022 = n()) %>%
      pivot_wider(names_from = Count_2022,
                  values_from = Tally_2022)
## `summarise()` has grouped output by 'Branding.Name.Mod', 'Name'. You can
## override using the `.groups` argument.
    company_level_change <-
      full_join(company_level_2018, company_level_2022)
## Joining, by = c("Branding.Name.Mod", "Name")
    company_level_change <- company_level_change %>%
      group_by(Branding.Name.Mod) %>%
      summarise(Exists_2022 = sum(Exists_2022, na.rm = T),
                Exists_2018 = sum(Exists_2018, na.rm = T),
                Change = sum(Exists_2022,-Exists_2018, na.rm = T))
    
    
    company_funds <- company_change %>%
      group_by(Name.of..Fund.s..Oldest.Share.Class) %>%
      mutate(
        Newest_ObDate = max(Obsolete_Year),
        Oldest_InDate = min(Inception_Year)
      ) %>%
      group_by(Branding.Name.Mod,
               Name.of..Fund.s..Oldest.Share.Class,
               Newest_ObDate,
               Oldest_InDate) %>%
      summarise() %>%
      mutate(
        Count_2022 =
          case_when(
            is.na(Newest_ObDate) ~ 1,
            Newest_ObDate == 2023 ~ 1,
            Newest_ObDate <= 2022 ~ 0
          ),
        Count_2018 =
          case_when(
            is.na(Newest_ObDate) & Oldest_InDate <= 2018 ~ 1,
            Oldest_InDate > 2018 | Newest_ObDate <= 2018 ~ 0,
            Newest_ObDate > 2018 ~ 1
          )
      )
## `summarise()` has grouped output by 'Branding.Name.Mod',
## 'Name.of..Fund.s..Oldest.Share.Class', 'Newest_ObDate'. You can override using
## the `.groups` argument.
    company_fund_change <- company_funds %>%
      group_by(Branding.Name.Mod) %>%
      summarise(
        Total_2018 = sum(Count_2018, na.rm = T),
        Total_2022 = sum(Count_2022, na.rm = T),
        Difference = sum(Total_2022,-Total_2018, na.rm = T)
      )
    
    company_count <- function(x, colz) {
      company_change %>%
        group_by_at(colz) %>%
        summarise(Count = n()) %>%
        ungroup() %>%
        arrange(desc(.[[2]]))
    }
    
    comp_inception <-
      company_count(colz = c("Branding.Name.Mod", "Inception_Year"))
## `summarise()` has grouped output by 'Branding.Name.Mod'. You can override using
## the `.groups` argument.
    comp_obsolete <-
      company_count(colz = c("Branding.Name.Mod", "Obsolete_Year"))
## `summarise()` has grouped output by 'Branding.Name.Mod'. You can override using
## the `.groups` argument.
    comp_new_funds <-
      company_count(
        colz = c("Branding.Name.Mod", "Inception_Year", "Oldest..Share.Class")
      ) %>%
      filter(Oldest..Share.Class == "Yes")
## `summarise()` has grouped output by 'Branding.Name.Mod', 'Inception_Year'. You
## can override using the `.groups` argument.
    comp_old_funds <-
      company_count(
        colz = c("Branding.Name.Mod", "Obsolete_Year", "Oldest..Share.Class")
      ) %>%
      filter(Oldest..Share.Class == "Yes")
## `summarise()` has grouped output by 'Branding.Name.Mod', 'Obsolete_Year'. You
## can override using the `.groups` argument.
    comp_rank <- function(x) {
      x %>%  filter(.[[2]] == "2022") %>%
        mutate(Rank = rank(-Count)) %>%
        filter(Rank <= 20) %>%
        arrange(Rank)
    }
    
    top_cutters <- comp_rank(comp_obsolete)
    top_old_cutters <-  comp_rank(comp_old_funds)
    
    top_launchers <- comp_rank(comp_inception)
    top_new_launchers <- comp_rank(comp_new_funds)

Top Launchers

(Share Class)

Top New Fund Launchers

(Oldest Share Class)

Top Cutters

(Share Class)

Top Fund Cutters

(Oldest Share Class)

Share Class Overview

##Launches

simp_count <- function(x, colz){
  Full %>%  
    group_by_at(colz) %>% 
    count(name = "Count")
  }

From Morningstar Direct

2022 Launches

##Liquidations

## Warning: Removed 15 rows containing missing values (`position_stack()`).

2022 Liquidations

Other/NA

This is an attempt to determine the share class type in cases where Morningstar has the share classs listed as other or NA.

Changes

## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html

Inception Year

## `summarise()` has grouped output by 'Share_Class_Mod'. You can override using
## the `.groups` argument.

Obsolete Year

## `summarise()` has grouped output by 'Share_Class_Mod'. You can override using
## the `.groups` argument.